{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Analyze"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The hypertools analyze function allows you to perform complex analyses (normalization, dimensionality reduction and alignment) in a single line of code!\n",
    "\n",
    "(Note that the order of operation is always the following normalize -> reduce -> alignment)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Import packages"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import hypertools as hyp\n",
    "import seaborn as sb\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load your data\n",
    "\n",
    "First, we'll load one of the sample datasets. This dataset is a list of 2 `numpy` arrays, each containing average brain activity (fMRI) from 18 subjects listening to the same story, fit using Hierarchical Topographic Factor Analysis (HTFA) with 100 nodes.  The rows are timepoints and the columns are fMRI components. \n",
    "\n",
    "See the [full dataset](http://dataspace.princeton.edu/jspui/handle/88435/dsp015d86p269k) or the [HTFA article](https://www.biorxiv.org/content/early/2017/02/07/106690) for more info on the data and HTFA, respectively. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "geo = hyp.load('weights_avg')\n",
    "weights = geo.get_data()\n",
    "print(weights[0].shape) # 300 TRs and 100 components\n",
    "print(weights[1].shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can see that the elements of weights each have the dimensions (300,100). We can further visualize the elements using a heatmap."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for x in weights:\n",
    "    sb.heatmap(x)\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Normalization"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here is an example where we z-score the columns within each list:\n",
    "\n",
    "Normalize accepts the following arguments, as strings:\n",
    "+ ‘across’ - z-scores columns across all lists (default)\n",
    "+ ‘within’ - z-scores columns within each list\n",
    "+ ‘row’ - z-scores each row of data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "norm_within = hyp.analyze(weights, normalize='within')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can again visualize the data (this time, normalized) using heatmaps."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "for x in norm_within:\n",
    "    sb.heatmap(x)\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Normalize and reduce"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To easily normalize and reduce the dimensionality of the data, pass the normalize, reduce, and ndims arguments to the `analyze` function. The normalize argument, outlined above, specifies how the data should be normalized. The `reduce` argumemnt, specifies the desired method of reduction. The `ndims` argument (int) specifies the number of dimensions to reduce to.\n",
    "\n",
    "Supported dimensionality reduction models include: PCA, IncrementalPCA, SparsePCA, MiniBatchSparsePCA, KernelPCA, FastICA, FactorAnalysis, TruncatedSVD, DictionaryLearning, MiniBatchDictionaryLearning, TSNE, Isomap, SpectralEmbedding, LocallyLinearEmbedding, and MDS. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "norm_reduced = hyp.analyze(weights, normalize='within', reduce='PCA', ndims=3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can again visualize the data using heatmaps."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for x in norm_reduced:\n",
    "    sb.heatmap(x)\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Finer control"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For finer control of the model parameters, `reduce` can be a dictionary with the keys `model` and `params`. See scikit-learn specific model docs for details on parameters supported for each model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "reduce={'model' : 'PCA', 'params' : {'whiten' : True}} # dictionary of parameters\n",
    "reduced_params = hyp.analyze(weights, normalize='within', reduce=reduce, ndims=3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can again visualize the data using heatmaps."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for x in reduced_params:\n",
    "    sb.heatmap(x)\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Normalize, reduce, and align"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, we can normalize, reduce and then align all in one step.\n",
    "\n",
    "The align argument can accept the following strings:\n",
    "+ 'hyper' - implements [hyperalignment](http://haxbylab.dartmouth.edu/publications/HGC+11.pdf) algorithm\n",
    "+ 'SRM' - implements shared response model via [Brainiak](http://brainiak.org)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "norm_red_algn = hyp.analyze(weights, normalize='within', reduce='PCA', ndims=3, align='SRM')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Again, we can visualize the normed, reduced, and aligned data using a heatmap."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for x in norm_red_algn:\n",
    "    sb.heatmap(x)\n",
    "    plt.show()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
